“Datapasta” for easy copy and paste in R

Introduction

Struggling to scrape a table from the web and format it correctly? Don’t worry: datapasta is here to simplify the process! This handy package lets you bring copied data into R without writing any import code. It is best suited to small tables rather than large datasets, but it is a huge time-saver for quick tasks. Let’s walk through how to use it step by step.

For this demonstration, we’ll fetch a table from the following site: click here

Let us begin by installing and loading the required package:

install.packages("datapasta")
library(datapasta)

Once “datapasta” is installed, restart R so that its commands appear in the RStudio Addins menu.

Step 1: Select and copy table

Open the demonstration page, select the table with the mouse, and copy it to the clipboard (Ctrl+C, or Cmd+C on macOS).

Note: The table can come from a website, a Word document, a spreadsheet, a CSV file, or any structured text. However, datapasta has no built-in support for PDF files. (In case you need to extract tables from a PDF, refer to our previous article: Importing and extracting tables from PDF into R using “pdftools”.)
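To see why these sources work, it helps to look at what a copied table looks like as text. The sketch below uses base R only and rests on an assumption: that the copied table arrives as tab-separated lines, which is typical for browsers and spreadsheets but not guaranteed. The string `clip_text` is a hypothetical stand-in for the clipboard contents, built from the first two rows of the demo table.

```r
# Assumption: this string stands in for what the clipboard holds after
# copying the header and first two rows of the demo table
# (tab-separated columns, one table row per line).
clip_text <- paste(
  "Company\tContact\tCountry",
  "Alfreds Futterkiste\tMaria Anders\tGermany",
  "Centro comercial Moctezuma\tFrancisco Chang\tMexico",
  sep = "\n"
)

# Base R can parse tab-separated text directly into a data frame.
parsed <- read.delim(text = clip_text, stringsAsFactors = FALSE)
print(parsed)
```

This only parses the text. What datapasta adds on top is writing out the R code that recreates the table, so the data lives in your script rather than on the clipboard.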

Step 2: Paste the data in R

Go to Addins and select the option of your choice, e.g. “Paste as tribble” or “Paste as data.frame”.

# datapasta writes the call as tibble::tribble(), so the tibble package only needs to be installed; if the prefix is missing, load it explicitly with library(tibble)
Table_as_tribble <- tibble::tribble(
                                     ~Company,           ~Contact,  ~Country,
                        "Alfreds Futterkiste",     "Maria Anders", "Germany",
                 "Centro comercial Moctezuma",  "Francisco Chang",  "Mexico",
                               "Ernst Handel",    "Roland Mendel", "Austria",
                             "Island Trading",    "Helen Bennett",      "UK",
               "Laughing Bacchus Winecellars",  "Yoshi Tannamuri",  "Canada",
               "Magazzini Alimentari Riuniti", "Giovanni Rovelli",   "Italy"
               )

print(Table_as_tribble)
# A tibble: 6 × 3
  Company                      Contact          Country
  <chr>                        <chr>            <chr>  
1 Alfreds Futterkiste          Maria Anders     Germany
2 Centro comercial Moctezuma   Francisco Chang  Mexico 
3 Ernst Handel                 Roland Mendel    Austria
4 Island Trading               Helen Bennett    UK     
5 Laughing Bacchus Winecellars Yoshi Tannamuri  Canada 
6 Magazzini Alimentari Riuniti Giovanni Rovelli Italy  
# Paste as data.frame
Table_as_df <-  data.frame(
  stringsAsFactors = FALSE,
                          Company = c("Alfreds Futterkiste",
                                      "Centro comercial Moctezuma","Ernst Handel","Island Trading",
                                      "Laughing Bacchus Winecellars",
                                      "Magazzini Alimentari Riuniti"),
                          Contact = c("Maria Anders","Francisco Chang",
                                      "Roland Mendel","Helen Bennett",
                                      "Yoshi Tannamuri","Giovanni Rovelli"),
                          Country = c("Germany","Mexico","Austria","UK",
                                      "Canada","Italy")
               )
print(Table_as_df)
                       Company          Contact Country
1          Alfreds Futterkiste     Maria Anders Germany
2   Centro comercial Moctezuma  Francisco Chang  Mexico
3                 Ernst Handel    Roland Mendel Austria
4               Island Trading    Helen Bennett      UK
5 Laughing Bacchus Winecellars  Yoshi Tannamuri  Canada
6 Magazzini Alimentari Riuniti Giovanni Rovelli   Italy
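The addins are not the only way in. To our understanding, datapasta also exports functions that can be called from a script or the console. The sketch below assumes `tribble_construct()` accepts a data frame and returns the generated code as a character string; check the package documentation for the exact behaviour in your installed version. `demo_df` is a shortened copy of the table above, used only for illustration.

```r
# A shortened copy of the demo table (two of the six rows shown above).
demo_df <- data.frame(
  stringsAsFactors = FALSE,
  Company = c("Alfreds Futterkiste", "Ernst Handel"),
  Contact = c("Maria Anders", "Roland Mendel"),
  Country = c("Germany", "Austria")
)

if (requireNamespace("datapasta", quietly = TRUE)) {
  # Assumption: tribble_construct() returns the tribble definition as a
  # string instead of inserting it into the editor.
  tribble_code <- datapasta::tribble_construct(demo_df)
  cat(tribble_code)
} else {
  message("datapasta is not installed; run install.packages(\"datapasta\") first.")
}
```

Calling the function this way is handy when you want the generated code inside a script or a report rather than pasted at the cursor.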

Beyond Pasting Data: Aligning and Formatting Made Easy

datapasta can also tidy up code you have already typed. Imagine you’ve written a long vector but forgot to add quotes around the elements. Adding the quotes by hand is tedious. Consider this example:
c(Germany, Mexico, Austria, UK, Canada, Italy)

This will result in the error: “Error: object ‘Germany’ not found”

With datapasta, you can quickly fix such issues, saving both time and effort.

# Select the vector, then go to Addins and choose "Toggle Vector Quotes" under datapasta
# Voilà, your vector is now quoted correctly
c("Germany","Mexico","Austria","UK","Canada","Italy")
[1] "Germany" "Mexico"  "Austria" "UK"      "Canada"  "Italy"  
# Similarly, "Fiddle Selection" reflows the selected code, here pivoting the vector to one element per line
c("Germany",
  "Mexico",
  "Austria",
  "UK",
  "Canada",
  "Italy")
[1] "Germany" "Mexico"  "Austria" "UK"      "Canada"  "Italy"  
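To make clear what the quote toggle does, here is a rough base R equivalent. It is not how datapasta implements the addin, just a sketch of the same transformation: treat the unquoted elements as text, split on commas, and rebuild the call with quotes. The names `raw_text`, `elements`, and `quoted_code` are ours.

```r
# The unquoted vector contents from the example above, held as plain text.
raw_text <- "Germany, Mexico, Austria, UK, Canada, Italy"

# Split on commas and strip the surrounding whitespace.
elements <- trimws(strsplit(raw_text, ",", fixed = TRUE)[[1]])

# Rebuild the c(...) call with every element wrapped in double quotes.
quoted_code <- paste0("c(", paste0('"', elements, '"', collapse = ","), ")")
cat(quoted_code, "\n")

# The generated code now evaluates without the "object not found" error.
countries <- eval(parse(text = quoted_code))
print(countries)
```

The addin does this in place on the selected text, which is why it is quicker than running a helper like this every time.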

Step 3: Set shortcuts in Addins

  • Go to Tools > Modify Keyboard Shortcuts… > find the datapasta addin > click its cell in the Shortcut column and type the key combination > Apply

Conclusion:

When it comes to copying and pasting small datasets or converting data into R-friendly formats, the datapasta package is a priceless tool for streamlining data operations in R.

Benefits:

  • Time-Saving

  • Ease of Use

Cons:

  • Limited to Small Datasets

  • Niche Use Case: Its focus on quick data entry makes it less relevant for those who work with larger datasets or established pipelines.